Last updated: 2017-02-26
Outcome: Juvenile Survival
Possible predictors:
Determine the association of each predictor while "controlling" for the other predictors.
How?
\[Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4\]
To estimate the coefficient of \(X_1\):
\[X_1 = \gamma_0 + \gamma_1 X_2 + \gamma_2 X_3 + \gamma_3 X_4\]
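As a sketch of why this auxiliary regression is useful: write the part of \(X_1\) that the other predictors cannot explain as a residual, then regress \(Y\) on that residual alone. The slope from that simple regression equals the estimate of \(\beta_1\) in the full model. The residual symbol \(r_1\), the observation index \(i\), and the hats marking estimates are notation introduced here for illustration; they are not from the original slides.

```latex
\[r_1 = X_1 - (\hat{\gamma}_0 + \hat{\gamma}_1 X_2 + \hat{\gamma}_2 X_3 + \hat{\gamma}_3 X_4)\]
\[\hat{\beta}_1 = \frac{\sum_i r_{1i} Y_i}{\sum_i r_{1i}^2}\]
```

The second line has this simple form because OLS residuals from a model with an intercept sum to zero, so no centering of \(r_1\) is needed.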
Assumption: Residuals from OLS regression should be centered on zero and normally distributed.
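One way this assumption might be checked is sketched below on simulated data. The variable names (`x1`, `x2`, `y`, `fit`) and the simulated coefficients are assumptions made for illustration only, and `shapiro.test()` is just one of several possible normality checks.

```r
# Simulated data stand in for a real data set (illustrative assumption).
set.seed(42)
n <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)
y <- 2 + 0.5 * x1 - 0.3 * x2 + rnorm(n, sd = 0.2)

fit <- lm(y ~ x1 + x2)
res <- residuals(fit)

# With an intercept in the model, OLS residuals average to zero
# (up to floating-point error).
mean_res <- mean(res)

# Shapiro-Wilk test: a small p-value would suggest non-normal residuals.
sw <- shapiro.test(res)

print(mean_res)
print(sw$p.value)
```

A visual check with `qqnorm(res)` followed by `qqline(res)` is a common complement to the formal test.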
What are the contributions of the fat and lactose content of mammalian milk to total milk energy?
library(tidyverse)  # %>%, select(), drop_na(), glimpse()
library(readxl)     # read_excel()
milk <- read_excel("../data/Milk.xlsx", na = "NA")
glimpse(milk)
## Observations: 29
## Variables: 8
## $ clade          <chr> "Strepsirrhine", "Strepsirrhine", "Strepsirrhin...
## $ species        <chr> "Eulemur fulvus", "E macaco", "E mongoz", "E ru...
## $ kcal.per.g     <dbl> 0.49, 0.51, 0.46, 0.48, 0.60, 0.47, 0.56, 0.89,...
## $ perc.fat       <dbl> 16.60, 19.27, 14.11, 14.91, 27.28, 21.22, 29.66...
## $ perc.protein   <dbl> 15.42, 16.91, 16.85, 13.18, 19.50, 23.58, 23.46...
## $ perc.lactose   <dbl> 67.98, 63.82, 69.04, 71.91, 53.22, 55.20, 46.88...
## $ mass           <dbl> 1.95, 2.09, 2.51, 1.62, 2.19, 5.25, 5.37, 2.51,...
## $ neocortex.perc <dbl> 55.16, NA, NA, NA, NA, 64.54, 64.54, 67.64, NA,...
Ignore for now that these are comparative species-level data.
Keep:
species: Species
kcal.per.g: Kilocalories of energy per gram of milk
perc.fat: Percent fat
perc.lactose: Percent lactose
Filter complete cases (drop rows with NA).
M <- milk %>% select(species, kcal.per.g, perc.fat, perc.lactose) %>%
drop_na()
names(M) <- c("Species", "Milk_Energy", "Fat", "Lactose")
glimpse(M)
## Observations: 29
## Variables: 4
## $ Species     <chr> "Eulemur fulvus", "E macaco", "E mongoz", "E rubri...
## $ Milk_Energy <dbl> 0.49, 0.51, 0.46, 0.48, 0.60, 0.47, 0.56, 0.89, 0....
## $ Fat         <dbl> 16.60, 19.27, 14.11, 14.91, 27.28, 21.22, 29.66, 5...
## $ Lactose     <dbl> 67.98, 63.82, 69.04, 71.91, 53.22, 55.20, 46.88, 3...
library(GGally)
ggscatmat(as.data.frame(M), columns = 2:4)
fm <- lm(Milk_Energy ~ Fat + Lactose, data = M)
summary(fm)
##
## Call:
## lm(formula = Milk_Energy ~ Fat + Lactose, data = M)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -0.11350 -0.05047  0.01103  0.04649  0.12701
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept)  1.007372   0.211168   4.770 6.16e-05 ***
## Fat          0.001952   0.002533   0.771  0.44784
## Lactose     -0.008709   0.002575  -3.382  0.00229 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.06447 on 26 degrees of freedom
## Multiple R-squared:  0.8518, Adjusted R-squared:  0.8404
## F-statistic: 74.74 on 2 and 26 DF,  p-value: 1.657e-11
Fat coefficient
Use Lactose to predict Fat, which will take the effect of Lactose out of the model when we predict Milk_Energy.
fm_Lact <- lm(Fat ~ Lactose, data = M)
M$resid_Lact <- residuals(fm_Lact)
head(M)
## # A tibble: 6 × 5
##              Species Milk_Energy   Fat Lactose resid_Lact
##                <chr>       <dbl> <dbl>   <dbl>      <dbl>
## 1     Eulemur fulvus        0.49 16.60   67.98   0.196070
## 2           E macaco        0.51 19.27   63.82  -1.115660
## 3           E mongoz        0.46 14.11   69.04  -1.279355
## 4      E rubriventer        0.48 14.91   71.91   2.267656
## 5        Lemur catta        0.60 27.28   53.22  -3.251415
## 6 Alouatta seniculus        0.47 21.22   55.20  -7.416264
Regressing Milk_Energy on these residuals recovers the Fat coefficient from the multiple regression:
coef(lm(Milk_Energy ~ resid_Lact, data = M))
## (Intercept)  resid_Lact
## 0.641724138 0.001952441
coef(fm)
##  (Intercept)          Fat      Lactose
##  1.007371840  0.001952441 -0.008708827
Lactose coefficient
Use Fat to predict Lactose, which will take the effect of Fat out of the model when we predict Milk_Energy.
fm_Fat <- lm(Lactose ~ Fat, data = M)
M$resid_Fat <- residuals(fm_Fat)
Regressing Milk_Energy on these residuals recovers the Lactose coefficient from the multiple regression:
coef(lm(Milk_Energy ~ resid_Fat, data = M))
##  (Intercept)    resid_Fat
##  0.641724138 -0.008708827
coef(fm)
##  (Intercept)          Fat      Lactose
##  1.007371840  0.001952441 -0.008708827
High correlation between predictors leaves little residual variation to be used for explaining the outcome variable.
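The point above can be illustrated with a small simulation. This is a minimal sketch on simulated data, not the milk data; the helper `sim_fit()`, the correlations 0.2 and 0.95, and the true coefficients are hypothetical choices made for illustration.

```r
set.seed(1)

# Hypothetical helper: simulate two predictors with correlation near rho,
# then report how much of x1 is left after removing x2, and the standard
# error of the x1 coefficient in the multiple regression.
sim_fit <- function(rho, n = 200) {
  x2 <- rnorm(n)
  x1 <- rho * x2 + sqrt(1 - rho^2) * rnorm(n)
  y <- 1 + 0.5 * x1 + 0.5 * x2 + rnorm(n)
  resid_x1 <- residuals(lm(x1 ~ x2))
  fit <- lm(y ~ x1 + x2)
  c(resid_var = var(resid_x1),
    se_x1 = summary(fit)$coefficients["x1", "Std. Error"])
}

low  <- sim_fit(0.2)   # weakly correlated predictors
high <- sim_fit(0.95)  # strongly correlated predictors
print(rbind(low, high))
```

With strongly correlated predictors, the residual variance of `x1` shrinks and the standard error of its coefficient grows, which is the sense in which little residual variation is left to explain the outcome.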
Multiple predictors are useful for predicting outcomes when bivariate relationships with the response variable are not strong.
But:
Milk is a big energetic investment
fm_Neo <- lm(Milk_Energy ~ Neocortex, data = M)
summary(fm_Neo)
##
## Call:
## lm(formula = Milk_Energy ~ Neocortex, data = M)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -0.19027 -0.14693 -0.03744  0.15613  0.29959
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.353332   0.501120   0.705    0.492
## Neocortex   0.004503   0.007389   0.609    0.551
##
## Residual standard error: 0.1764 on 15 degrees of freedom
## Multiple R-squared:  0.02417, Adjusted R-squared:  -0.04089
## F-statistic: 0.3715 on 1 and 15 DF,  p-value: 0.5513
fm_Mass <- lm(Milk_Energy ~ log_Mass, data = M)
summary(fm_Mass)
##
## Call:
## lm(formula = Milk_Energy ~ log_Mass, data = M)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -0.26908 -0.09190 -0.03189  0.13180  0.30209
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  0.70516    0.05185  13.599 7.68e-10 ***
## log_Mass    -0.03169    0.02160  -1.467    0.163
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.167 on 15 degrees of freedom
## Multiple R-squared:  0.1255, Adjusted R-squared:  0.0672
## F-statistic: 2.153 on 1 and 15 DF,  p-value: 0.163
fm_Multi <- lm(Milk_Energy ~ Neocortex + log_Mass, data = M)
summary(fm_Multi)
##
## Call:
## lm(formula = Milk_Energy ~ Neocortex + log_Mass, data = M)
##
## Residuals:
##       Min        1Q    Median        3Q       Max
## -0.250574 -0.039212  0.000633  0.072997  0.201985
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.085254   0.515281  -2.106  0.05372 .
## Neocortex    0.027931   0.008015   3.485  0.00364 **
## log_Mass    -0.096402   0.024749  -3.895  0.00162 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1265 on 14 degrees of freedom
## Multiple R-squared:  0.5317, Adjusted R-squared:  0.4648
## F-statistic: 7.948 on 2 and 14 DF,  p-value: 0.004939
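The pattern in the three fits above (weak bivariate relationships, stronger coefficients when both predictors enter together) can be reproduced with simulated data. The sketch below assumes two positively correlated predictors with opposite-signed effects, which is one way such a pattern can arise; it is not a claim about the mechanism in the milk data. All names (`x1`, `x2`, `y`) and numeric settings are hypothetical.

```r
set.seed(7)
n <- 200
x1 <- rnorm(n)
x2 <- 0.9 * x1 + sqrt(1 - 0.9^2) * rnorm(n)  # strongly correlated with x1
y  <- x1 - x2 + rnorm(n, sd = 0.5)           # opposite-signed true effects

fit_x1    <- lm(y ~ x1)        # bivariate fit, x1 only
fit_x2    <- lm(y ~ x2)        # bivariate fit, x2 only
fit_multi <- lm(y ~ x1 + x2)   # both predictors together

# Slope for x1 alone versus x1 adjusted for x2.
slopes <- c(bivariate_x1 = unname(coef(fit_x1)["x1"]),
            multiple_x1  = unname(coef(fit_multi)["x1"]))

# Variance explained by each model.
r2 <- c(x1    = summary(fit_x1)$r.squared,
        x2    = summary(fit_x2)$r.squared,
        multi = summary(fit_multi)$r.squared)

print(slopes)
print(r2)
```

Because the two effects partly cancel in each bivariate fit, each predictor looks weak on its own; fitting them jointly separates the two contributions.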
Regression asks (and answers):
Complete Quiz 07-3
Watch Lecture 07-4